Goto

Collaborating Authors

 permutation importance


Multimodality Stacking with Blockwise missing values and application to the PIONeeR biomarkers study for prediction of resistance to immunotherapy

arXiv.org Machine Learning

Integrating multimodal datasets in clinical oncology is frequently hindered by high dimensionality and blockwise missingness, where entire data sources are unavailable for specific patient subsets. Standard survival models often struggle with these gaps, leading to biased results or patient exclusion. We introduce Multimodality Stacking with Blockwise missing values (MSB), a late-fusion framework for survival analysis that independently models modality-specific features before aggregating predictions via a cross-validated stacking meta-learner. MSB was validated on the PIONeeR study (n=443 patients, 378 biomarkers across eight heterogeneous sources) to predict progression-free survival in advanced non-small cell lung cancer patients receiving immunotherapy. MSB yielded higher predictive performance (C-index) than baseline algorithms. Improvements varied by baseline strength: linear models showed a 15.9% increase (p<0.001 for the Wilcoxon signed-rank test), random survival forests gained 5.4% (p=0.002), and gradient boosting methods improved by 2.1% (p=0.030). Beyond discrimination, MSB reduced the generalization gap (train-test difference in 5 folds cross-validation repeated 3 times: 0.055 vs 0.380 for linear models). Permutation importance analysis identified routine laboratory markers, clinical features, and PD-L1 expression as primary predictive drivers. Missing block indicators showed negligible importance, suggesting the model learned from biomarker values rather than data availability patterns. MSB provides a statistically validated framework for multimodal survival prediction with blockwise missingness. By enabling systematic biomarker evaluation without requiring complete data, MSB offers a practical tool for predictive modeling in biomedical research, pending external validation. Implementation is available at https://github.com/MohamedBoussena/MSB under Inria license.



Can Copulas Be Used for Feature Selection? A Machine Learning Study on Diabetes Risk Prediction

arXiv.org Machine Learning

Accurate diabetes risk prediction relies on identifying key features from complex health datasets, but conventional methods like mutual information (MI) filters and genetic algorithms (GAs) often overlook extreme dependencies critical for high-risk subpopulations. In this study we introduce a feature-selection framework using the upper-tail dependence coefficient (λU) of the novel A2 copula, which quantifies how often extreme higher values of a predictor co-occur with diabetes diagnoses (target variable). Applied to the CDC Diabetes Health Indicators dataset (n=253,680), our method prioritizes five predictors (self-reported general health, high blood pressure, body mass index, mobility limitations, and high cholesterol levels) based on upper tail dependencies. These features match or outperform MI and GA selected subsets across four classifiers (Random Forest, XGBoost, Logistic Regression, Gradient Boosting), achieving accuracy up to 86.5% (XGBoost) and AUC up to 0.806 (Gradient Boosting), rivaling the full 21-feature model. Permutation importance confirms clinical relevance, with BMI and general health driving accuracy. To our knowledge, this is the first work to apply a copula's upper-tail dependence for supervised feature selection, bridging extreme-value theory and machine learning to deliver a practical toolkit for diabetes prevention.


Unveiling Secrets of Brain Function With Generative Modeling: Motion Perception in Primates & Cortical Network Organization in Mice

arXiv.org Artificial Intelligence

This Dissertation is comprised of two main projects, addressing questions in neuroscience through applications of generative modeling. Project #1 (Chapter 4) explores how neurons encode features of the external world. I combine Helmholtz's "Perception as Unconscious Inference" -- paralleled by modern generative models like variational autoencoders (VAE) -- with the hierarchical structure of the visual cortex. This combination leads to the development of a hierarchical VAE model, which I test for its ability to mimic neurons from the primate visual cortex in response to motion stimuli. Results show that the hierarchical VAE perceives motion similar to the primate brain. Additionally, the model identifies causal factors of retinal motion inputs, such as object- and self-motion, in a completely unsupervised manner. Collectively, these results suggest that hierarchical inference underlines the brain's understanding of the world, and hierarchical VAEs can effectively model this understanding. Project #2 (Chapter 5) investigates the spatiotemporal structure of spontaneous brain activity and its reflection of brain states like rest. Using simultaneous fMRI and wide-field Ca2+ imaging data, this project demonstrates that the mouse cortex can be decomposed into overlapping communities, with around half of the cortical regions belonging to multiple communities. Comparisons reveal similarities and differences between networks inferred from fMRI and Ca2+ signals. The introduction (Chapter 1) is divided similarly to this abstract: sections 1.1 to 1.8 provide background information about Project #1, and sections 1.9 to 1.13 are related to Project #2. Chapter 2 includes historical background, Chapter 3 provides the necessary mathematical background, and finally, Chapter 6 contains concluding remarks and future directions.


Efficient Milling Quality Prediction with Explainable Machine Learning

arXiv.org Artificial Intelligence

This paper presents an explainable machine learning (ML) approach for predicting surface roughness in milling. Utilizing a dataset from milling aluminum alloy 2017A, the study employs random forest regression models and feature importance techniques. The key contributions include developing ML models that accurately predict various roughness values and identifying redundant sensors, particularly those for measuring normal cutting force. Our experiments show that removing certain sensors can reduce costs without sacrificing predictive accuracy, highlighting the potential of explainable machine learning to improve cost-effectiveness in machining.


Model-independent variable selection via the rule-based variable priority

arXiv.org Machine Learning

While achieving high prediction accuracy is a fundamental goal in machine learning, an equally important task is finding a small number of features with high explanatory power. One popular selection technique is permutation importance, which assesses a variable's impact by measuring the change in prediction error after permuting the variable. However, this can be problematic due to the need to create artificial data, a problem shared by other methods as well. Another problem is that variable selection methods can be limited by being model-specific. We introduce a new model-independent approach, Variable Priority (VarPro), which works by utilizing rules without the need to generate artificial data or evaluate prediction error. The method is relatively easy to use, requiring only the calculation of sample averages of simple statistics, and can be applied to many data settings, including regression, classification, and survival. We investigate the asymptotic properties of VarPro and show, among other things, that VarPro has a consistent filtering property for noise variables. Empirical studies using synthetic and real-world data show the method achieves a balanced performance and compares favorably to many state-of-the-art procedures currently used for variable selection.


Augmented Functional Random Forests: Classifier Construction and Unbiased Functional Principal Components Importance through Ad-Hoc Conditional Permutations

arXiv.org Machine Learning

This paper introduces a novel supervised classification strategy that integrates functional data analysis (FDA) with tree-based methods, addressing the challenges of high-dimensional data and enhancing the classification performance of existing functional classifiers. Specifically, we propose augmented versions of functional classification trees and functional random forests, incorporating a new tool for assessing the importance of functional principal components. This tool provides an ad-hoc method for determining unbiased permutation feature importance in functional data, particularly when dealing with correlated features derived from successive derivatives. Our study demonstrates that these additional features can significantly enhance the predictive power of functional classifiers. Experimental evaluations on both real-world and simulated datasets showcase the effectiveness of the proposed methodology, yielding promising results compared to existing methods.


DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks

arXiv.org Artificial Intelligence

We propose a permutation-based explanation method for image classifiers. Current image-model explanations like activation maps are limited to instance-based explanations in the pixel space, making it difficult to understand global model behavior. In contrast, permutation based explanations for tabular data classifiers measure feature importance by comparing model performance on data before and after permuting a feature. We propose an explanation method for image-based models that permutes interpretable concepts across dataset images. Given a dataset of images labeled with specific concepts like captions, we permute a concept across examples in the text space and then generate images via a text-conditioned diffusion model. Feature importance is then reflected by the change in model performance relative to unpermuted data. When applied to a set of concepts, the method generates a ranking of feature importance. We show this approach recovers underlying model feature importance on synthetic and real-world image classification tasks.


Feature Importance and Explainability in Quantum Machine Learning

arXiv.org Machine Learning

Many Machine Learning (ML) models are referred to as black box models, providing no real insights into why a prediction is made. Feature importance and explainability are important for increasing transparency and trust in ML models, particularly in settings such as healthcare and finance. With quantum computing's unique capabilities, such as leveraging quantum mechanical phenomena like superposition, which can be combined with ML techniques to create the field of Quantum Machine Learning (QML), and such techniques may be applied to QML models. This article explores feature importance and explainability insights in QML compared to Classical ML models. Utilizing the widely recognized Iris dataset, classical ML algorithms such as SVM and Random Forests, are compared against hybrid quantum counterparts, implemented via IBM's Qiskit platform: the Variational Quantum Classifier (VQC) and Quantum Support Vector Classifier (QSVC). This article aims to provide a comparison of the insights generated in ML by employing permutation and leave one out feature importance methods, alongside ALE (Accumulated Local Effects) and SHAP (SHapley Additive exPlanations) explainers.


Predictability of Machine Learning Algorithms and Related Feature Extraction Techniques

arXiv.org Artificial Intelligence

To implement machine learning, it is essential to first determine an appropriate algorithm for the dataset. Different algorithms may produce a large number of different models with different hyperparameter configurations, and it usually takes a lot of time to run the model on a large dataset when the model is relatively complex. Therefore, how to predict the performance of a model on a dataset is an fundamental problem to be solved. This thesis designs a prediction system based on matrix factorization to predict the classification accuracy of a specific model on a particular dataset. In this thesis, we conduct a comprehensive empirical research on more than fifty datasets that we collected from the openml web site.